Human Parsing


Human parsing is the process of identifying, segmenting, and categorizing the parts of a human body in an image or video, such as the head, shoulders, knees, and toes.

XMusic: Towards a Generalized and Controllable Symbolic Music Generation Framework

Jan 15, 2025

MC-VTON: Minimal Control Virtual Try-On Diffusion Transformer

Jan 07, 2025

Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs

Jan 08, 2025

LLM+AL: Bridging Large Language Models and Action Languages for Complex Reasoning about Actions

Jan 01, 2025

OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning

Dec 31, 2024

Bridging Context Gaps: Enhancing Comprehension in Long-Form Social Conversations Through Contextualized Excerpts

Dec 28, 2024

Exploring More from Multiple Gait Modalities for Human Identification

Dec 16, 2024

RefHCM: A Unified Model for Referring Perceptions in Human-Centric Scenarios

Dec 19, 2024

Attention with Dependency Parsing Augmentation for Fine-Grained Attribution

Dec 16, 2024

Digestion Algorithm in Hierarchical Symbolic Forests: A Fast Text Normalization Algorithm and Semantic Parsing Framework for Specific Scenarios and Lightweight Deployment

Dec 18, 2024